This study examined a stratified random sample (n=62) of prospective teachers' work across 
eight institutions on three tasks that utilized dynamic statistical software. Our work was guided 
by considering how teachers may utilize their statistical knowledge and technological statistical 
knowledge to engage in cycles of investigation. Although teachers did not tend to take full 
advantage of dynamic linking capabilities, they utilized a large variety of graphical 
representations and often added statistical measures or other augmentations to graphs as part of 
their analysis. 


Purpose of Study 
Dynamic statistical software tools have become more common in schools in the past decade, 
and mathematics teacher educators are beginning to use these tools in courses for prospective 
mathematics teachers. Although the tools are available, teachers’ effective use of these tools in 
classrooms is influenced by their own understanding of how to use the tools to explore statistical 
ideas. In this paper we examine how prospective teachers use representations of data when 
solving statistical tasks using Fathom (Finzer, 2002) or TinkerPlots (Konold & Miller, 2005). 


Theoretical Framework and Background 

Lee and Hollebrands (2008, in press) proposed a framework that characterizes the important 
aspects of knowledge needed to teach statistics with technology (see Figure 1). In this 
framework, three components consisting of Statistical Knowledge (SK), Technological 
Statistical Knowledge (TSK), and Technological Pedagogical Statistical Knowledge (TPSK) are 
envisioned as layered circles with the innermost layer representing TPSK, a subset of SK and 
TSK. Thus, Lee and Hollebrands propose that one’s TPSK is founded on and developed with 
teachers’ technological statistical knowledge (TSK) and statistical knowledge (SK). 

Within Statistical Knowledge is prospective teachers’ ability to engage in transnumeration 
(Wild & Pfannkuch, 1999) as a process of transforming a representation between a real system 
(real-world phenomena) and a statistical system (ways of modeling the phenomena statistically) 
with an intention of engendering understanding (Pfannkuch & Wild, 2004). Thus, teachers 
should be able to collect data, represent them meaningfully with graphs and computed statistical 
measures, and translate their understandings of the data back to the context. Often, 
transnumeration occurs when data are represented in a way that highlights a certain aspect 
of the context and can afford new insights into the data. 

Within TSK, our focus is on how prospective teachers can take advantage of technology's 
capabilities to automate calculations of measures and generate graphical displays, and the ways 
they use these graphs and measures to explore data and visualize abstract ideas (Chance, 
Ben-Zvi, Garfield, & Medina, 2007). For example, how do prospective teachers visualize measures 
(e.g., mean) and graphical augmentations (e.g., shading a region of data, showing squares on a least 
squares line), and how do they take advantage of ways to link multiple representations? 


[Figure 1: layered circles with TPSK as the innermost layer. Components listed in each layer:]

Statistical Knowledge (SK)
• Engaging in statistical thinking
• Recognizing need for data
• Transnumerating
• Considering variation
• Reasoning with statistical models
• Integrating the context

Technological Statistical Knowledge (TSK)
• Automation of calculations and representations
• Emphasis on data exploration
• Visualization of abstract concepts
• Simulations
• Investigations of real data
• Collaborative tools

Technological Pedagogical Statistical Knowledge (TPSK)
• Understanding students’ learning and thinking of statistical ideas
• Conception of how technology tools and representations support statistical thinking
• Instructional strategies for developing statistics lessons with technology
• Critical stance towards evaluation and use of curricula materials for teaching statistical ideas with technology


Figure 1. Components of Technological Pedagogical Statistical Knowledge (adapted from 
Lee & Hollebrands, in press). 


Several researchers have studied teachers' use of dynamic statistical tools, usually focused 
within a single professional development experience or course at a specific university (e.g., 
Doerr & Jacob, 2011; Hammerman & Rubin, 2004; Makar & Confrey, 2008; Meletiou-Mavrotheris, 
Paparistodemou, & Stylianou, 2009). Overall, these studies have shown that 
dynamic tools can provide opportunities for teachers to broaden their approaches to statistical 
problem solving, moving beyond traditional computation-based techniques and utilizing more 
graph-based analysis. Teachers using TinkerPlots and Fathom in these studies often 
combined graphical and statistical measures by either adding a measure to a graph or using a 
graph to make sense of a statistical measure computed separately. Such analysis by teachers 
often affords opportunities to consider an aggregate view of a distribution that incorporates 
reasoning about centers and spreads (Konold & Higgins, 2003). Although prior research 
discussed how teachers analyzed data where a link among representations can be inferred, 
researchers often did not focus their analysis on how teachers utilized linked representations. 


Method 

The overarching research question for the study was: When teachers use technology tools to 
solve data analysis tasks, in what ways do they use representations (dynamic and static) to 
investigate these problems? To further research teachers’ use of dynamic statistical tools beyond 
small samples and limited contexts, data were collected from eight different institutions in which 
faculty were using materials (Lee, Hollebrands, & Wilson, 2010) developed by the PTMT 
project. What is reported here is based on analysis of teachers’ work on three tasks that use 
similar statistical concepts and tools in either TinkerPlots or Fathom (see Table 1). 

The faculty implementing the materials attended a week-long summer institute to become 
familiar with technologies, specific tasks and data sets, and pedagogical issues. Across 
institutions, materials were implemented in a variety of courses, some focused on using 
technology to teach middle or secondary mathematics, and others on statistics for elementary or 
middle school teachers. The courses predominantly enrolled prospective teachers, with a few 
practicing teachers or graduate students. Each teacher worked individually and created a 
document that described his or her work, including illustrative screenshots of ways he or she used 
technology in solving the task. A total of 247 documents were collected across institutions and 
blinded to protect teacher, faculty, and institutional identity. 


Table 1. Research tasks 

Task | Task as Posed in Materials 

Ch1 Task (TinkerPlots) | [Note: Faculty had a choice of two similar data sets of state-level school data from two 
regions of the US. Data included attributes such as average teacher salary, student per teacher 
ratios, expenditures per student.] 
Explore the attributes in this data set and compare the distributions for the South and West 
[Northeast and Midwest]. Based on the data you have examined, in which region would you 
prefer to teach and why? Provide a detailed description of your comparisons. Include copies of 
plots and calculations as necessary. 

Ch3 Task (Fathom) | Explore several of the attributes in the 2006Vehicle data set. 
a) Generate a question that involves a comparison of distributions that you would like your 
future students to investigate. 
b) Use Fathom to investigate your question. Provide a detailed description of your comparisons 
and your response to the question posed. Include copies of plots and calculations as necessary. 

Ch4 Task (Fathom) | Explore several of the attributes in the 2006Vehicle data set. 
a) Generate a question that involves examining relationships among attributes that you would 
like your future students to investigate. 
b) Use Fathom to investigate your question. Provide a detailed description of your work and your 
response to the question posed in part a. Include copies of plots and calculations as necessary. 

To begin analysis, four documents were randomly selected from each chapter. Through 
iterative discussions by the research team, examining documents and making sense of 
prospective teachers' work, both the top-down methods of Miles and Huberman (1994) and 
grounded theory (Strauss & Corbin, 1990) were used to develop and apply a coding instrument. 
The coding instrument that emerged was based on theory from research on statistical problem 
solving, particularly cycles of exploratory data analysis and typical phases (ask a question, 
represent data, analyze/interpret, and make a decision, e.g., Wild & Pfannkuch, 1999) and use of 
static and dynamic representations in statistics and other domains of mathematics; and on 
categories and codes that emerged from analyzing an initial random sample of teachers' work. In 
the coding procedures, each teacher's work was chunked into smaller cycles of investigation that 
included four phases (Choose Focus, Represent Data, Analyze/Interpret, and Make Decision). 
Within each phase, several categories were used to characterize the work (e.g., number of 
attributes, type of representations, what was noticed, interpretations, type of claim). 

In the second phase of analysis, we examined a larger randomly chosen subset of documents. 
From an initial review of the 247 documents it was obvious that some responses were more 
detailed than others, contained more statistical investigation cycles, and used more 
representations. Thus, each document was classified as either short or long. Short responses were 
typically 1 page and included 1-2 screenshots with minimal explanation. All others were classified 
as long responses. We then drew a stratified random sample to have proportional 
representation of short and long responses, and to select about 25% of our documents (Table 2). 

Table 2. Design of stratified random sample 

                          |      Chapter 1       |      Chapter 3       |      Chapter 4 
                          | Short | Long | Total | Short | Long | Total | Short | Long | Total 
Total Documents Collected |  52   |  50  |  102  |  12   |  29  |  41   |  38   |  66  |  104 
Stratified Random Sample  |  13   |  12  |  25   |   4   |   8  |  12   |   9   |  16  |  25 
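
The proportional allocation in Table 2 can be illustrated with a minimal sketch. It is not the procedure the research team used; it assumes the per-chapter sample sizes (25, 12, 25) were fixed in advance and that documents in each stratum can be stood in for by hypothetical integer IDs. The function names and the random seed are illustrative only.

```python
import random

# Short/long document counts per chapter, taken from Table 2.
collected = {
    "Chapter 1": {"short": 52, "long": 50},
    "Chapter 3": {"short": 12, "long": 29},
    "Chapter 4": {"short": 38, "long": 66},
}
# Per-chapter sample sizes, also taken from Table 2 (assumed fixed in advance).
sample_sizes = {"Chapter 1": 25, "Chapter 3": 12, "Chapter 4": 25}


def proportional_allocation(strata, n):
    """Split a sample of size n across the two strata in proportion to their sizes."""
    total = strata["short"] + strata["long"]
    n_short = round(n * strata["short"] / total)
    return {"short": n_short, "long": n - n_short}


def draw_sample(strata, n, rng):
    """Draw hypothetical document IDs (0..size-1) without replacement from each stratum."""
    allocation = proportional_allocation(strata, n)
    return {
        name: sorted(rng.sample(range(strata[name]), k))
        for name, k in allocation.items()
    }


rng = random.Random(0)  # arbitrary seed, used only to make the sketch reproducible
allocations = {
    ch: proportional_allocation(collected[ch], sample_sizes[ch]) for ch in collected
}
samples = {ch: draw_sample(collected[ch], sample_sizes[ch], rng) for ch in collected}
```

Under these assumptions, rounding the proportional shares reproduces the short and long counts in the second row of Table 2.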


Pairs of coders were assigned 8-12 documents to code. All documents were initially coded 
individually, then the pair met to discuss, compare, record inter-rater reliability (IRR), and come to 
consensus for each document. During early stages, several coding categories were clarified and 
new categories emerged that were then adopted into the coding procedures by all coders. For 
the categories of codes discussed in this paper, the overall IRR across documents was 0.923 
for coding representations used, and 0.94 for coding measures added to representations. There 
was low initial agreement about coding an augmentation to a graph (0.523), which led to 
discussions to establish agreed-upon coding and a better definition of a graphical augmentation. 
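
The paper does not state which reliability statistic was used. As one possibility, the sketch below computes simple proportion agreement between two coders; this choice is an assumption, and the code labels in the example are hypothetical rather than drawn from the study data.

```python
def percent_agreement(codes_a, codes_b):
    """Return the proportion of items to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same number of items.")
    if not codes_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes for the representation used in four investigation cycles.
coder_1 = ["dot plot", "box plot", "dot plot", "scatterplot"]
coder_2 = ["dot plot", "box plot", "histogram", "scatterplot"]
agreement = percent_agreement(coder_1, coder_2)  # 3 of 4 codes match
```

Proportion agreement does not correct for chance agreement; a chance-corrected statistic such as Cohen's kappa would generally give lower values on the same codes.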


Results 

A major advantage of dynamic statistical software is the ability to link representations and 
view different representations and statistical measures or graphical augmentations that can 
perhaps lead to interpretive insights. Thus, to organize our cross-document and cross-chapter 
analysis, we first focused on whether or not a teacher provided evidence of linking 
representations (see Table 3). We considered a teacher to be dynamically linking two or more 
representations if there was evidence that he or she interacted (e.g., clicked on or dragged a data point, 
selected a region of data) with data in one representation and used the linked (often 
highlighted) information about the data in another representation. A teacher was coded as statically 
linking representations if he or she coordinated two or more representations without direct interactions. With 
differences noted across chapters, we carefully examined use of representations within chapters. 


Table 3. Number of responses providing evidence of linking representations 

                               | No Evidence of Linking | Static Linking | Dynamic Linking 
*Chapter 1 (TinkerPlots) n=25  | 13 (52%)               | 11 (44%)       | 3 (12%) 
Chapter 3 (Fathom) n=12        | 8 (66.67%)             | 2 (16.67%)     | 2 (16.67%) 
*Chapter 4 (Fathom) n=25       | 14 (56%)               | 7 (28%)        | 6 (24%) 
*Totals (n=62)                 | 35 (56.5%)             | 20 (32.3%)     | 11 (17.7%) 

*Percents do not sum to 100 since 2 cases in Ch1 and 2 cases in Ch4 were coded as both static and dynamic. 
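
The footnote to Table 3 can be checked with a short sketch. The counts are those reported in the table; the "both" values come from the footnote, and the zero for Chapter 3 is an assumption inferred from that row not being starred. The function names are illustrative.

```python
# Counts from Table 3; "both" = responses coded as both static and dynamic.
table3 = {
    "Chapter 1": {"n": 25, "none": 13, "static": 11, "dynamic": 3, "both": 2},
    "Chapter 3": {"n": 12, "none": 8, "static": 2, "dynamic": 2, "both": 0},
    "Chapter 4": {"n": 25, "none": 14, "static": 7, "dynamic": 6, "both": 2},
}


def distinct_linkers(row):
    """Count teachers who linked in any way, counting double-coded cases once."""
    return row["static"] + row["dynamic"] - row["both"]


def accounted_for(row):
    """Total responses once double-coded cases are counted only once."""
    return row["none"] + distinct_linkers(row)


linkers = {chapter: distinct_linkers(row) for chapter, row in table3.items()}
consistent = all(accounted_for(row) == row["n"] for row in table3.values())
```

With the overlap removed, each row adds up to its chapter sample size, and the number of distinct linkers in Chapter 4 matches the 11 teachers reported in the text.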


Representation Use in Chapter 1 Task 

In coding Chapter 1 documents, we did not specifically list a data card of a collection as a 
representation of that data, as it was automatically available in TinkerPlots. Data cards look like 
a stack of index cards, with each card representing a case (e.g., the state of Virginia) and 
containing the values for that case for each attribute (e.g., Average salary, Census region) in the 
dataset. Thus, we considered this as a representation of data readily available. Data Tables and 
Plots, however, were considered representations as teachers needed to drag down a Data Table to 
view a different numerical representation, or use a Plot and construct a visual representation. The 
most common Plots created were dotplots and box plots. When teachers engaged in dynamically 
linking or statically coordinating information in more than one representation, they could have 
been explicitly using or attending to a data card as a representation in that process. This focus 
was typically evident with those who used dynamic linking. The most common purpose for 
dynamic linking was to identify a particular value for a specific case of interest (e.g., clicking on 
a particular case icon in the graph and using the data card to determine the value of an attribute) 
or to click on a different attribute in the data card that would augment the graph by recoloring the 
data icons for a new attribute of interest. In the first type of linking, teachers were often focused 
on special cases such as those that appeared to be outliers and often situated these cases in 
comparison to the aggregate (see Figure 2 as an example of this type of linking). It was also 
inferred that teachers linked the data card and graphical displays in order to report specific values 
(e.g., data point at Q1) or compute measures such as the range. In the latter type of dynamic 
linking, the addition of a second or third attribute in the plot often increased the complexity of 
analysis as teachers considered relationships among attributes. 


The next attribute I looked at was the number of high school grads. First of all the 
median for the West only has 17,000 high school grads where the South has about 
37,000 grads. This is a big difference and I think that it basically tells me that the South 
has more students. Also, the West has 1 outlier, California and the south has 2 outliers, 
Florida and Texas. I would not take this into consideration when choosing a place to 
teach because the data would have to be percentage of high school grads for it to mean 
anything. This is just because there are more students in the South. 


Figure 2. Using dynamic linking to identify state names of data points considered outliers. 


In considering ways teachers may have used their TSK, we further examined relationships 
among their actions of augmenting graphs, adding statistical measures to graphs, and whether 
they linked representations. All teachers who dynamically or statically linked representations 
also augmented their graph. Teachers who interacted more with a graph through augmenting 
tended to also engage in linking representations (66.6% of teachers who augmented also linked). 
This pattern was not as strong when considering a relationship between adding a statistical 
measure to a graph and linking representations. Only 50% of teachers who added statistical 
measures to a graph also linked representations, and 3 of the 7 teachers who did not add any 
statistical measures to a graph still engaged in linking representations. 


Representation Use in Chapter 3 Task 

In Chapter 3, teachers were asked to explore several attributes in the data and to determine 
questions for their future students to compare distributions. More than half of the examined 
documents were completed and submitted within Fathom, rather than in a Word document, so teachers 
could leave many of their representations viewable in Fathom and write their responses in a text 
box. Those who wrote their response in a Word document interspersed their text responses with 
purposeful screenshots of their work in Fathom. The data table was not specifically listed as a 
representation in Chapter 3 documents as teachers used this as a primary view of the data in 
Fathom and for grabbing the label for attributes to graph, similar to how teachers used the 
TinkerPlots data cards in Chapter 1. In one case a teacher used the data table as a dynamically 
linked representation; it was thus coded as a representation. Chapter 3 documents tended to be 
long (71%) yet had the smallest percentage of teachers across all chapters who showed evidence 
of linking representations. The two teachers who linked dynamically did so to examine particular 
cases and clicked on data points in a graph to locate specific values for attributes in a data table. 
The two teachers who demonstrated evidence of static linking did so near the end of their 
problem solving to examine trends across cycles. Both purposes are the same as in Chapter 1. 
Teachers working on the Chapter 3 task used a wide variety of representations, and often 2-3 
representations per cycle of investigation. They seemed to take advantage of the ability to 
generate multiple graph types (box plots, dot plots, scatterplots), and used Summary Tables 
extensively (many had more than one). Several teachers also added statistical measures (e.g., 
mean, median) to the graphs in Fathom; however, in two cases, it seemed they were mainly using 
a graph window as a place to compute a statistical measure (e.g., IQR, StDev) whose location in 
the distribution did not add a way to reason about the measure in relation to the aggregate. 


Representation Use in Chapter 4 Task 

In coding representations in Chapter 4 documents, again we did not specifically list the data 
table as a representation, as it was used as a primary view of data in Fathom. Six of the 25 
documents sampled from Chapter 4 were completed entirely within Fathom rather than as a 
Word document. Of the 25 teachers, only 44% (n=11) linked their representations statically or 
dynamically. Overall, the most common purpose for linking was to compare the position of 
groups of cases across graphical representations and to make statements about their relationships. 
Of the four teachers who statically linked representations, three linked between two or more 
graphs. This linking typically occurred as a teacher noted the shape of a graph and commented 
on whether or not a second graph using a different attribute was similarly shaped. Furthermore, of 
the four teachers who statically linked representations, two linked numerical measures with a 
graphical representation. One of these teachers linked the standard deviation to a histogram in an 
attempt to make sense of the magnitude of the standard deviation. There was no evidence that the 
other teacher attempted to make sense of, or interpret, the slope of the least squares line that was 
explicitly linked. Five teachers either implicitly or explicitly linked their representations 
dynamically. Three of these teachers linked two graphs, one linked a graph with the data table, 
and another linked the numerical values of sliders (controlling values of coefficients in a model) 
to characteristics of a graph of the model. Two teachers who linked both dynamically and statically did 
so by linking several univariate graphs, and one teacher also linked a boxplot to a scatterplot. 

In considering the ways teachers may have used their TSK, we further examined 
relationships among their actions of augmenting graphs, adding statistical measures to graphs, 
and the ways in which they did or did not link representations. Six out of 11 teachers who 
dynamically or statically linked representations also augmented their graph, whereas 40% of 
teachers neither augmented their graph, nor engaged in any type of linking among 
representations, and 66% of teachers who did not augment their graph also did not link 
representations. Five of the 11 teachers who linked representations also added statistical 
measures to their graphs. Out of 16 teachers who only displayed data graphically, two-thirds (11 
of 16) did not link their graphs. In contrast, all teachers who either statically or dynamically 
linked representations (n=11) used graphical displays. 


Cross-chapter Representation Use 

Teachers in Chapter 3 tended to use more, and more distinct, types of representations of data 
in their responses than those in Chapter 1 or 4. This may be an artifact of more than 50% of 
teachers in the Chapter 3 sample submitting their work in a Fathom document. Using Fathom as 
an analysis and reporting environment may give better insight into all the representations 
teachers used. When reporting work in a separate document and asked to supply screenshots of 
their work, teachers may not report all their representations used. The lack of evidence of linking 
representations may be an artifact of the task posed for Chapter 3 or may be due to the difficulty 
of illustrating linking when many representations are in a single Fathom document. 

The graphs teachers created in Chapters 1 and 3 (tasks about comparing distributions) were 
typically double box plots or dot plots, with a few bar charts and scatterplots in Chapter 3. While 
Chapter 1 teachers most often used one graph per investigation cycle, teachers in Chapters 3 and 
4 typically used more than one graph per cycle. A much wider variety of representations (simple 
box plots and dot plots, double box plots and dot plots, histograms, and scatterplots) were used in 
Chapter 4 (questions about relationships among attributes), and almost always there was more 
than one representation per cycle. In Chapters 1 and 3, the most common use of dynamic linking 
was to coordinate a single graph with either a data card or data table to find out details about a 
specific case of interest. Only occasionally in solving the Chapters 1 and 3 tasks did teachers use 
dynamic linking to look at an interval of data. However, teachers who linked representations in a 
dynamic way in Chapter 4 often were comparing the position of groups of cases across graphical 
representations and using such noticing to make statements about relationships. Static linking 
was done across chapters to compare trends in graphs of different attributes across cycles. This 
was often done towards the end as teachers reflected on their work and made a final decision. 

Teachers who dynamically linked representations tended to also augment a graph (75% 
across all chapters). Most of the teachers (71%) who either did not link representations or only 
statically linked representations did not augment a graph. Chapter 3 teachers added more 
summary statistics to provide detail to fully answer the questions they explored. In Chapter 3, 
75% of teachers added statistical measures to summary tables as compared to 28% in Chapter 4. 
A similar percentage of teachers from Chapters 3 and 4 (58% and 56%, respectively) added 
statistical measures to graphical representations as compared to 72% of the teachers in Chapter 1. 
This may be attributed to features of TinkerPlots and Fathom. Whereas Fathom offers the ability 
to summarize statistical measures in a summary table, TinkerPlots makes it easy to 
incorporate summary statistics, such as measures of center, on graphical representations. 


Discussion 

Given the dynamic nature of Fathom and TinkerPlots, and emphasis in the materials on using 
dynamically linked representations, it was somewhat surprising how few teachers' responses 
provided evidence of either dynamic or static linking among representations. We recognize, 
however, that lack of evidence of linking does not mean that teachers did not engage in this 
activity, just that they did not report their work or findings in a way that we could infer that 
linking had occurred. In addition, it was apparent that those who linked in the Chapter 1 and 3 
tasks did so for similar purposes, often focused on specific individual cases, whereas the Chapter 
4 teachers who linked often did so to examine group propensities. We wonder if this difference is 
related to the nature of the tasks (comparing distributions versus relationships among attributes). 
In the next phase of analysis, it will be important to have a better way of capturing how a 
teacher's use of representations is connected with the complexity of his or her statistical problem 
solving (i.e., what is the relationship between SK and TSK). Based on our cross-institutional 
sample, teacher educators may need to provide many opportunities to engage teachers in tasks 
that explicitly encourage dynamic linking. This may help teachers understand and use 
dynamic capabilities in their statistical work, and hopefully in their work with students. 